Robustness and risk-sensitivity in Markov decision processes

Author

  • Takayuki Osogami
Abstract

We uncover relations between robust MDPs and risk-sensitive MDPs. The objective of a robust MDP is to minimize a function, such as the expectation of cumulative cost, for the worst case when the parameters have uncertainties. The objective of a risk-sensitive MDP is to minimize a risk measure of the cumulative cost when the parameters are known. We show that a risk-sensitive MDP of minimizing the expected exponential utility is equivalent to a robust MDP of minimizing the worst-case expectation with a penalty for the deviation of the uncertain parameters from their nominal values, which is measured with the Kullback-Leibler divergence. We also show that a risk-sensitive MDP of minimizing an iterated risk measure that is composed of certain coherent risk measures is equivalent to a robust MDP of minimizing the worst-case expectation when the possible deviations of uncertain parameters from their nominal values are characterized with a concave function.
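The first equivalence can be checked numerically in a single-stage setting. The sketch below is an illustration only, not the paper's MDP construction: it takes one discrete random cost (no states or policies) and relies on the standard variational identity (1/gamma) log E_p[exp(gamma X)] = max_q { E_q[X] - (1/gamma) KL(q || p) }. All function names, the cost values, the nominal probabilities, and the risk-sensitivity parameter gamma are hypothetical choices made for the example.

```python
import math


def entropic_risk(costs, probs, gamma):
    """Risk-sensitive value: (1/gamma) * log E_p[exp(gamma * X)]."""
    m = max(costs)  # shift by the largest cost for numerical stability
    s = sum(p * math.exp(gamma * (c - m)) for c, p in zip(costs, probs))
    return m + math.log(s) / gamma


def kl_divergence(q, p):
    """Kullback-Leibler divergence KL(q || p) for discrete distributions."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)


def penalized_expectation(costs, q, p, gamma):
    """Expected cost under q, minus the KL penalty for deviating from p."""
    expected = sum(qi * c for qi, c in zip(q, costs))
    return expected - kl_divergence(q, p) / gamma


def worst_case_distribution(costs, probs, gamma):
    """Maximizer of the penalized expectation: q_i proportional to p_i * exp(gamma * c_i)."""
    m = max(costs)
    weights = [p * math.exp(gamma * (c - m)) for c, p in zip(costs, probs)]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical single-stage example (values chosen only for illustration).
costs = [1.0, 2.0, 5.0]
nominal = [0.5, 0.3, 0.2]
gamma = 0.7

risk = entropic_risk(costs, nominal, gamma)
q_star = worst_case_distribution(costs, nominal, gamma)
robust = penalized_expectation(costs, q_star, nominal, gamma)

# The two values agree up to floating-point error.
print(abs(risk - robust) < 1e-9)
```

Any other distribution q gives a penalized expectation no larger than the entropic risk; in particular, choosing q equal to the nominal distribution incurs no penalty and recovers the plain expected cost, which is why the risk-sensitive value is never below the risk-neutral one.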

Similar resources

Accelerated decomposition techniques for large discounted Markov decision processes

Many hierarchical techniques to solve large Markov decision processes (MDPs) are based on the partition of the state space into strongly connected components (SCCs) that can be classified into some levels. In each level, smaller problems named restricted MDPs are solved, and then these partial solutions are combined to obtain the global solution. In this paper, we first propose a novel algorith...

Recursive robust estimation and control without commitment

In a Markov decision problem with hidden state variables, a posterior distribution serves as a state variable and Bayes’ law under an approximating model gives its law of motion. A decision maker expresses fear that his model is misspecified by surrounding it with a set of alternatives that are nearby when measured by their expected log likelihood ratios (entropies). Martingales represent alter...

Prediction of liver diseases using a hidden Markov model

Background: The liver is the largest internal organ and, after the heart and brain, the most important organ in the human body; life is impossible without it. Diagnosing liver disease takes a long time and requires sufficient expertise on the doctor's part. Statistical methods can serve as an automated forecasting system and help specialists diagnose liver disease quickly and accurately. Hidden Ma...

Stormwater quality models: do they work?

Stormwater models underpin decision-making processes in stormwater management. Runoff generation and flow routing models are now well developed and widely adopted. However, stormwater quality models are less well developed. Model calibration and sensitivity analysis are crucial in order to estimate realistic stormwater pollution concentrations. The Metropolis algorithm (Metropolis et al., 1953)...

Developing a model for simulating urban expansion based on the concept of decision risk: A case study in Babol city

Today, the study of the spatial-temporal pattern of urban physical expansion and the identification of the parameters affecting the expansion play a crucial role in urban-related decision-making and long-term planning processes. Consequently, the use of precise and efficient methods to predict the physical expansion of urban areas is of great importance. The objective of the present study is to pro...

Publication date: 2012